2 research outputs found

    Unsupervised Domain Adaptation for Semantic Segmentation using One-shot Image-to-Image Translation via Latent Representation Mixing

    Domain adaptation is a prominent strategy for handling both the domain shift that is widely encountered in large-scale land use/land cover map calculation and the scarcity of pixel-level ground truth, which is crucial for supervised semantic segmentation. Studies on adversarial domain adaptation via re-styling of source domain samples, commonly through generative adversarial networks, have reported varying levels of success, yet they suffer from semantic inconsistencies and visual corruptions, and often require a large number of target domain samples. In this letter, we propose a new unsupervised domain adaptation method for the semantic segmentation of very high resolution images that i) produces semantically consistent and noise-free images, ii) operates with a single target domain sample (i.e., one-shot), and iii) uses a fraction of the parameters required by state-of-the-art methods. More specifically, an image-to-image translation paradigm is proposed, based on an encoder-decoder principle in which latent content representations are mixed across domains; a perceptual network module and loss function are further introduced to enforce semantic consistency. Cross-city comparative experiments show that the proposed method outperforms state-of-the-art domain adaptation methods. Our source code will be available at https://github.com/Sarmadfismael/LRM_I2I
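The abstract does not say how latent content representations are mixed across domains. As a rough illustration only, the sketch below assumes the simplest possible rule, a convex combination of a source-domain and a target-domain latent tensor. The function name `mix_latents` and the weight `alpha` are hypothetical and are not taken from the paper or its repository.

```python
import numpy as np


def mix_latents(z_source, z_target, alpha=0.5):
    """Blend two latent content tensors with a convex combination.

    ASSUMPTION: this linear mixing rule is a stand-in for illustration;
    the paper's actual mixing operator is not described in the abstract.
    """
    z_source = np.asarray(z_source, dtype=float)
    z_target = np.asarray(z_target, dtype=float)
    if z_source.shape != z_target.shape:
        raise ValueError("latent tensors must share a shape")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    # alpha = 1 keeps the source content, alpha = 0 keeps the target content.
    return alpha * z_source + (1.0 - alpha) * z_target


# Toy latents with shape (channels, height, width).
z_src = np.ones((4, 8, 8))
z_tgt = np.zeros((4, 8, 8))
z_mix = mix_latents(z_src, z_tgt, alpha=0.25)
```

In an encoder-decoder pipeline, a tensor such as `z_mix` would be passed to the decoder in place of the unmixed source latent. The encoder and decoder are omitted here because the abstract does not specify them.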

    Crowd counting via joint SASNet and a guided batch normalization network

    Recent studies on crowd counting have achieved promising results with convolutional neural network (CNN) architectures. However, because scene distributions vary widely across real-world crowd datasets, standard CNN methods often suffer performance drops, and achieving high performance remains a challenge. To address this challenge, this paper proposes a crowd-counting approach that combines three state-of-the-art methods: Guided-Batch-Normalization, which adapts the model using normalization parameters of the unseen dataset; the Scale Adaptive Selection Network, which uses a multi-level network to obtain variation feature representations; and Distribution-Matching-Count, which uses a new loss function between prediction and ground truth maps. Combining these methods improves performance. Extensive experiments across multiple datasets demonstrate that the proposed approach outperforms state-of-the-art methods.
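The idea of adapting a model through the normalization parameters of an unseen dataset can be illustrated with plain batch-normalization arithmetic. The sketch below is not the Guided-Batch-Normalization method itself, whose details the abstract does not give. It assumes the simplest variant, in which the stored source-domain mean and variance are replaced by statistics computed on target-domain features. All function and variable names are hypothetical.

```python
import numpy as np


def batch_norm(x, mean, var, gamma=1.0, beta=0.0, eps=1e-5):
    """Apply the standard batch-normalization transform per channel."""
    return gamma * (x - mean) / np.sqrt(var + eps) + beta


def target_statistics(features):
    """Per-channel mean and variance of features shaped (samples, channels)."""
    return features.mean(axis=0), features.var(axis=0)


rng = np.random.default_rng(0)

# Source statistics the model was trained with (assumed values).
source_mean, source_var = np.zeros(3), np.ones(3)

# Features from an unseen dataset whose distribution is shifted and rescaled.
target_features = rng.normal(loc=5.0, scale=2.0, size=(1000, 3))

# Using source statistics leaves the shift in place...
unadapted = batch_norm(target_features, source_mean, source_var)

# ...whereas statistics re-estimated on the target data remove it.
t_mean, t_var = target_statistics(target_features)
adapted = batch_norm(target_features, t_mean, t_var)
```

With the re-estimated statistics, each channel of `adapted` is approximately zero-mean and unit-variance, while `unadapted` keeps the offset of the target data.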